Spam Filtering using Contextual Network Graphs

ثبت نشده
چکیده

This document describes a machine-learning solution to the spam-filtering problem. Spam-filtering is treated as a text-classification problem in very high dimension space. Two new text-classification algorithms, Latent Semantic Indexing (LSI) and Contextual Network Graphs (CNG) are compared to existing Bayesian techniques by monitoring their ability to process and correctly classify a series of spam and non-spam documents. LSI and CNG algorithms have an advantage over the Naïve Bayes classifier in the domain of natural language processing as they include a representation of ‘context’, or relations between terms. Both LSI and CNG take advantage of these relations to offer a conceptual or semantic-based search, which has been adapted in this paper to the domain of spamfiltering.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Spam Source Clustering by Constructing Spammer Network with Correlation Measure

Spam filtering is one of the most challenging problems in electric message systems. In general, recent studies on specifying real spam source are based on content filtering because spammers usually falsify their origin. We propose a method to specify spam source based on structural analysis with complex network. We assume that each spam sources either has the same victim list or uses the same s...

متن کامل

Filtering Network Spam Message using Approximated Logistic Regression

The development of telecom network and Internet provides effective ways for communication. As an important way in communication, Short Messaging Service (SMS) via both telecom network and Internet has played an increasing important role in daily life. However, it usually suffers from spam SMS that causes misunderstanding and cheat. The highly varying content, network environment make the identi...

متن کامل

Detecting Image Spam Using Image Texture Features

Filtering image email spam is considered to be a challenging problem because spammers keep modifying the images being used in their campaigns by employing different obfuscation techniques. Therefore, preventing text recognition using Optical Character Recognition (OCR) tools and imposing additional challenges in filtering such type of spam. In this paper, we propose an image spam filtering tech...

متن کامل

Chinese Spam Filtering Based On Back-Propagation Neural Networks

As the email service is becoming an important communication way on the Network, the spam is increasing every day. This paper describes a new filtering model based on email content by using Back-Propagation Neural Networks (BPNN). And for the Chinese email, it uses Natural Language Processing & Information Retrieval Sharing Platform (NLPIR) system to perform Chinese word segmentation. The simula...

متن کامل

An Effective Model for SMS Spam Detection Using Content-based Features and Averaged Neural Network

In recent years, there has been considerable interest among people to use short message service (SMS) as one of the essential and straightforward communications services on mobile devices. The increased popularity of this service also increased the number of mobile devices attacks such as SMS spam messages. SMS spam messages constitute a real problem to mobile subscribers; this worries telecomm...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004